Tutorial (Part 1): Visualizing Attacks on a Honey Pot

In this first part, we:

Import Graphistry and load a CSV of log entries
Visualize it as a graph by treating each row as an edge
Color edges using a categorical palette based on the kind of alert
Create a nodes table to control node sizes and colors

You can download this notebook to run it locally.



In [1]:

    
import pandas
import graphistry

# You can also set your API key once and for all in the environment variable "GRAPHISTRY_API_KEY".
#graphistry.register(key='<go to www.graphistry.com/api-request to get a key>', server='labs.graphistry.com')

Load data with Pandas



In [3]:

    
logs = pandas.read_csv('.././../data/honeypot.csv')
logs[:3] # Show the first three rows of the loaded dataframe









    Out[3]:







  
    
      
   attackerIP       victimIP      victimPort  vulnName          count  time(max)     time(min)
0  1.235.32.141     172.31.14.66  139.0       MS08067 (NetAPI)  6      1.421434e+09  1.421423e+09
1  105.157.235.22   172.31.14.66  445.0       MS08067 (NetAPI)  4      1.422498e+09  1.422495e+09
2  105.186.127.152  172.31.14.66  445.0       MS04011 (LSASS)   1      1.419966e+09  1.419966e+09

Dates in time(max) and time(min) are Unix timestamps (seconds since the epoch). Pandas can parse them into datetimes.



In [3]:

    
logs['time(max)'] = pandas.to_datetime(logs['time(max)'], unit='s')
logs['time(min)'] = pandas.to_datetime(logs['time(min)'], unit='s')
logs[:3]









    Out[3]:






  
    
      
   attackerIP       victimIP      victimPort  vulnName          count  time(max)            time(min)
0  1.235.32.141     172.31.14.66  139         MS08067 (NetAPI)  6      2015-01-16 18:39:37  2015-01-16 15:37:49
1  105.157.235.22   172.31.14.66  445         MS08067 (NetAPI)  4      2015-01-29 02:15:35  2015-01-29 01:25:55
2  105.186.127.152  172.31.14.66  445         MS04011 (LSASS)   1      2014-12-30 18:59:20  2014-12-30 18:59:20
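
To see the conversion in isolation, here is a minimal sketch that parses a single epoch value. The value 1421433577 is an assumption: the table only shows the rounded float 1.421434e+09, so we picked the whole-second timestamp that matches the first displayed time(max).

```python
import pandas

# Hypothetical sample: one epoch value in seconds (assumed; the tutorial's
# table only shows it rounded to 1.421434e+09).
sample = pandas.Series([1421433577])

# unit='s' tells pandas the numbers count seconds since 1970-01-01 UTC.
parsed = pandas.to_datetime(sample, unit='s')

print(parsed[0])  # 2015-01-16 18:39:37
```

Without unit='s', pandas would interpret the integers as nanoseconds, and every row would land within a couple of seconds of 1970-01-01.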

Minimal Graph

To create a graph, we bind the columns attackerIP and victimIP to indicate the start/end of edges. The result is a graph of IPs connected by log entries.



In [4]:

    
g = graphistry.bind(source='attackerIP', destination='victimIP').edges(logs)
g.plot()









    Out[4]:

Coloring edges by Vulnerabilities

We compute the desired edge colors by creating a new column (ecolor) that assigns each vulnerability name a different color code. We then tell the plotter to override the default edge coloring by binding that column to the edge_color attribute.

See the list of color codes at https://graphistry.github.io/docs/legacy/api/0.9.2/api.html#extendedpalette



In [5]:

    
vulnerabilityToColorCode = {vulnName: idx for idx, vulnName in enumerate(logs.vulnName.unique())}
vulnerabilityToColorCode









    Out[5]:





{'DCOM Vulnerability': 7,
 'HTTP Vulnerability': 8,
 'IIS Vulnerability': 3,
 'MS04011 (LSASS)': 1,
 'MS08067 (NetAPI)': 0,
 'MYDOOM Vulnerability': 4,
 'MaxDB Vulnerability': 2,
 'SYMANTEC Vulnerability': 5,
 'TIVOLI Vulnerability': 6}



In [6]:

    
edges = logs.copy() # Copy the original data to avoid unintended modifications.
#Set an edge's color to the value in the vulnerability lookup table
edges['ecolor'] = edges.vulnName.map(lambda vulnName: vulnerabilityToColorCode[vulnName])
edges[:3]









    Out[6]:







  
    
      
   attackerIP       victimIP      victimPort  vulnName          count  time(max)     time(min)     ecolor
0  1.235.32.141     172.31.14.66  139.0       MS08067 (NetAPI)  6      1.421434e+09  1.421423e+09  0
1  105.157.235.22   172.31.14.66  445.0       MS08067 (NetAPI)  4      1.422498e+09  1.422495e+09  0
2  105.186.127.152  172.31.14.66  445.0       MS04011 (LSASS)   1      1.419966e+09  1.419966e+09  1
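
The lookup-table approach can be tried on a few values without the full dataset. The sketch below uses a hypothetical four-row sample of vulnerability names and compares the dictionary mapping with pandas.factorize, an alternative not used in this tutorial, which assigns integer codes in order of first appearance.

```python
import pandas

# Hypothetical sample of vulnerability names (not the full honeypot data).
vuln = pandas.Series(['MS08067 (NetAPI)', 'MS08067 (NetAPI)',
                      'MS04011 (LSASS)', 'MaxDB Vulnerability'])

# Tutorial approach: build a name -> code dictionary, then map it over the column.
lookup = {name: idx for idx, name in enumerate(vuln.unique())}
mapped = vuln.map(lookup)

# Alternative: factorize returns the integer codes and the unique values directly.
codes, uniques = pandas.factorize(vuln)

print(mapped.tolist())  # [0, 0, 1, 2]
print(codes.tolist())   # [0, 0, 1, 2]
```

Both give the same codes here because unique() and factorize both order values by first appearance. The dictionary is still handy when you want to inspect or hand-edit the color assignment.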



In [7]:

    
# Finally, bind ecolor to the edge colors and plot.
# Plot with g2 (the plotter carrying the new binding), not the original g.
g2 = g.bind(edge_color='ecolor')
g2.plot(edges)









    Out[7]:

Controlling Node Attributes by Creating a Node Table

To set the size and colors of nodes, we need to create a node table where each node is represented by a row.

We gather a list of all nodes by concatenating the unique values of the source and destination columns of the edge table. This lists our node identifiers and will be the first column of the node table.
Then we add an additional column to the node table for each visual attribute such as color or size.
Finally, we tell the plotter which column to bind as the node identifier and which columns to bind to the desired visual attributes.

We proceed in a few steps: collect all attacker IPs and color them red, collect all victim IPs and color them yellow, and then concatenate the IPs together into one table.



In [8]:

    
#Create the table of attackers. Our node identifier column will be called "IP".
attackers = edges.attackerIP.to_frame('IP')
attackers['type'] = 'attacker'
attackers['pcolor'] = 67006  #red
attackers[:3]









    Out[8]:







  
    
      
   IP               type      pcolor
0  1.235.32.141     attacker  67006
1  105.157.235.22   attacker  67006
2  105.186.127.152  attacker  67006



In [9]:

    
# Same steps, but for victims (destinations)
victims = edges.victimIP.to_frame('IP')
victims['type'] = 'victim'
victims['pcolor'] = 67001  #yellow
victims[:3]









    Out[9]:







  
    
      
   IP            type    pcolor
0  172.31.14.66  victim  67001
1  172.31.14.66  victim  67001
2  172.31.14.66  victim  67001



In [10]:

    
#Combine the two tables
#If an IP is both an attacker and a victim, prioritize coloring it as an attacker
nodes = pandas.concat([attackers, victims], ignore_index=True).drop_duplicates('IP')
nodes[:4]









    Out[10]:







  
    
      
   IP               type      pcolor
0  1.235.32.141     attacker  67006
1  105.157.235.22   attacker  67006
2  105.186.127.152  attacker  67006
3  105.227.98.90    attacker  67006
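
To check that the concatenate-then-deduplicate step really prioritizes attackers, here is a small sketch on a toy edge table. The IPs and the toy_ names are hypothetical and chosen so that one address (10.0.0.2) appears as both attacker and victim.

```python
import pandas

# Hypothetical toy edge table: 10.0.0.2 is both an attacker and a victim.
toy_edges = pandas.DataFrame({
    'attackerIP': ['10.0.0.1', '10.0.0.2', '10.0.0.1'],
    'victimIP':   ['10.0.0.2', '10.0.0.3', '10.0.0.3'],
})

toy_attackers = toy_edges.attackerIP.to_frame('IP')
toy_attackers['type'] = 'attacker'

toy_victims = toy_edges.victimIP.to_frame('IP')
toy_victims['type'] = 'victim'

# drop_duplicates keeps the first occurrence by default. Attackers are listed
# first in the concat, so their row wins whenever an IP appears in both tables.
toy_nodes = pandas.concat([toy_attackers, toy_victims],
                          ignore_index=True).drop_duplicates('IP')

print(toy_nodes)
```

The result has one row per IP, and 10.0.0.2 keeps the type attacker. Swapping the order of the two tables in concat would flip that priority.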



In [11]:

    
# We can now pass both the edge and node tables to "plot".
g2.bind(node='IP', point_color='pcolor').plot(edges, nodes)









    Out[11]:

Exploring Graphs Interactively: Summarize, Filter, Drill Down, and Compare

Within the visualization, you can now filter and drill down into the graph.

For cool results, try to:

Open the histogram panel and add histograms for victimPort, vulnName, and count. By selecting a region of a histogram or clicking on a bar, you can filter the graph. For instance, the NetAPI vulnerability has the biggest bar and is therefore the most common vulnerability. By clicking on its bar to filter to only those edges, we see that it is only present in the big cluster of attacks against IP 172.31.14.66. (Click again to remove the filter.)

With the histogram panel open, click on data brush and then lasso a selection on the graph. The histograms highlight the subset of nodes under the selection. You can drag the data brush selection to compare different subgraphs. For example, we see that the attackers did not find many vulnerabilities in the smaller part of the honeypot.
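
The histogram filter has a rough equivalent in pandas, which can help when you want to confirm what you see in the visualization. This is only a sketch on hypothetical sample rows (the attacker IPs and the second victim IP are made up); on the real data you would run the same calls on the edges table.

```python
import pandas

# Hypothetical sample rows shaped like the honeypot edge table.
sample = pandas.DataFrame({
    'attackerIP': ['192.0.2.1', '192.0.2.2', '192.0.2.3', '192.0.2.4'],
    'victimIP': ['172.31.14.66', '172.31.14.66', '172.31.14.66', '172.31.0.1'],
    'vulnName': ['MS08067 (NetAPI)', 'MS08067 (NetAPI)',
                 'MS04011 (LSASS)', 'MaxDB Vulnerability'],
})

# Counterpart of the vulnName histogram: number of rows per vulnerability.
counts = sample.vulnName.value_counts()

# Counterpart of clicking the NetAPI bar: keep only those edges,
# then list which victims they touch.
netapi = sample[sample.vulnName == 'MS08067 (NetAPI)']
victims_hit = netapi.victimIP.unique()

print(counts)
print(victims_hit)
```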

Going Further

In the next part of the tutorial, we cover:

Creating multiple graph views of the same data
Aggregating multi-edges into bundles
